Aberdeen
Revealed: The UK streets with the slowest broadband - so, is YOUR road on the list?
You might feel like your home's internet connection is painfully slow, but experts have now revealed which neighbourhoods really have Britain's worst broadband. New research conducted by Broadband Genie compiled over 145,000 speed tests from users across the UK to find Britain's slowest streets. And it is bad news for the residents of Heol-Y-Fedw in Port Talbot, who face download speeds of 0.81 megabytes per second, the slowest of any street in the UK.
- North America > United States > New York (0.25)
- North America > Greenland (0.24)
- North America > Canada > Alberta (0.14)
- (23 more...)
- Media > Television (1.00)
- Media > Music (1.00)
- Leisure & Entertainment > Sports > Football (1.00)
- (4 more...)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Communications > Mobile (0.68)
A Unified Definition of Hallucination, Or: It's the World Model, Stupid
Liu, Emmy, Gangal, Varun, Zou, Chelsea, Huang, Xiaoqi, Yu, Michael, Chang, Alex, Tao, Zhuofu, Kumar, Sachin, Feng, Steven Y.
Despite numerous attempts to solve the issue of hallucination since the inception of neural language models, it remains a problem in even frontier large language models today. Why is this the case? We walk through definitions of hallucination used in the literature from a historical perspective up to the current day, and fold them into a single definition of hallucination, wherein different prior definitions focus on different aspects of our definition. At its core, we argue that hallucination is simply inaccurate (internal) world modeling, in a form where it is observable to the user (e.g., stating a fact which contradicts a knowledge base, or producing a summary which contradicts a known source). By varying the reference world model as well as the knowledge conflict policy (e.g., knowledge base vs. in-context), we arrive at the different existing definitions of hallucination present in the literature. We argue that this unified view is useful because it forces evaluations to make clear their assumed "world" or source of truth, clarifies what should and should not be called hallucination (as opposed to planning or reward/incentive-related errors), and provides a common language to compare benchmarks and mitigation techniques. Building on this definition, we outline plans for a family of benchmarks in which hallucinations are defined as mismatches with synthetic but fully specified world models in different environments, and sketch out how these benchmarks can use such settings to stress-test and improve the world modeling components of language models.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Ohio (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (4 more...)
- Research Report (0.50)
- Personal > Honors (0.47)
Learning to Code with Context: A Study-Based Approach
Borghoff, Uwe M., Minas, Mark, Schopp, Jannis
The rapid emergence of generative AI tools is transforming the way software is developed. Consequently, software engineering education must adapt to ensure that students not only learn traditional development methods but also understand how to meaningfully and responsibly use these new technologies. In particular, project-based courses offer an effective environment to explore and evaluate the integration of AI assistance into real-world development practices. This paper presents our approach and a user study conducted within a university programming project in which students collaboratively developed computer games. The study investigates how participants used generative AI tools throughout different phases of the software development process, identifies the types of tasks where such tools were most effective, and analyzes the challenges students encountered. Building on these insights, we further examine a repository-aware, locally deployed large language model (LLM) assistant designed to provide project-contextualized support. The system employs Retrieval-Augmented Generation (RAG) to ground responses in relevant documentation and source code, enabling qualitative analysis of model behavior, parameter sensitivity, and common failure modes. The findings deepen our understanding of context-aware AI support in educational software projects and inform future integration of AI-based assistance into software engineering curricula.
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (9 more...)
- Research Report > New Finding (1.00)
- Instructional Material > Course Syllabus & Notes (1.00)
- Overview (0.92)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.69)
When LLMs Can't Help: Real-World Evaluation of LLMs in Nutrition
Li, Karen Jia-Hui, Balloccu, Simone, Dusek, Ondrej, Reiter, Ehud
The increasing trust in large language models (LLMs), especially in the form of chatbots, is often undermined by the lack of their extrinsic evaluation. This holds particularly true in nutrition, where randomised controlled trials (RCTs) are the gold standard, and experts demand them for evidence-based deployment. LLMs have shown promising results in this field, but these are limited to intrinsic setups. We address this gap by running the first RCT involving LLMs for nutrition. We augment a rule-based chatbot with two LLM-based features: (1) message rephrasing for conversational variety and engagement, and (2) nutritional counselling through a fine-tuned model. In our seven-week RCT (n=81), we compare chatbot variants with and without LLM integration. We measure effects on dietary outcome, emotional well-being, and engagement. Despite our LLM-based features performing well in intrinsic evaluation, we find that they did not yield consistent benefits in real-world deployment. These results highlight critical gaps between intrinsic evaluations and real-world impact, emphasising the need for interdisciplinary, human-centred approaches. We provide all of our code and results at: https://github.com/saeshyra/diet-chatbot-trial
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Europe > United Kingdom > Scotland > City of Aberdeen > Aberdeen (0.04)
- (5 more...)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Consumer Health (1.00)
- Education > Health & Safety > School Nutrition (1.00)
- Health & Medicine > Health Care Technology (0.68)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)
Pushdown Reward Machines for Reinforcement Learning
Varricchione, Giovanni, Klassen, Toryn Q., Alechina, Natasha, Dastani, Mehdi, Logan, Brian, McIlraith, Sheila A.
Reward machines (RMs) are automata structures that encode (non-Markovian) reward functions for reinforcement learning (RL). RMs can reward any behaviour representable in regular languages and, when paired with RL algorithms that exploit RM structure, have been shown to significantly improve sample efficiency in many domains. In this work, we present pushdown reward machines (pdRMs), an extension of reward machines based on deterministic pushdown automata. pdRMs can recognise and reward temporally extended behaviours representable in deterministic context-free languages, making them more expressive than reward machines. We introduce two variants of pdRM-based policies, one which has access to the entire stack of the pdRM, and one which can only access the top $k$ symbols (for a given constant $k$) of the stack. We propose a procedure to check when the two kinds of policies (for a given environment, pdRM, and constant $k$) achieve the same optimal state values. We then provide theoretical results establishing the expressive power of pdRMs, and space complexity results for the proposed learning problems. Lastly, we propose an approach for off-policy RL algorithms that exploits counterfactual experiences with pdRMs. We conclude by providing experimental results showing how agents can be trained to perform tasks representable in deterministic context-free languages using pdRMs.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Netherlands (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- (3 more...)
- Government (0.46)
- Education (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)
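To make the pushdown reward machine idea from the abstract above concrete, here is a minimal sketch, not the authors' formalism or code. It assumes a toy task in which an agent must pick up some number of items and then deliver exactly that many, a behaviour of the form pickup^n deliver^n that needs a stack because n is unbounded. The class name `PushdownRewardMachine`, the event names, the state names, and the helpers `top_k` and `run` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PushdownRewardMachine:
    """Toy pdRM (hypothetical) rewarding traces pickup^n deliver^n, n >= 1."""

    state: str = "collect"
    stack: list = field(default_factory=list)

    def step(self, event: str) -> float:
        """Consume one high-level event and return the reward it earns."""
        if self.state == "collect" and event == "pickup":
            # Remember one outstanding delivery by pushing a stack symbol.
            self.stack.append("item")
            return 0.0
        if self.state in ("collect", "deliver") and event == "deliver" and self.stack:
            # Each delivery discharges one earlier pickup.
            self.stack.pop()
            self.state = "deliver" if self.stack else "done"
            return 1.0 if self.state == "done" else 0.0
        # Any other event (including anything after "done") violates the task.
        self.state = "failed"
        return 0.0

    def top_k(self, k: int) -> tuple:
        """Return the top k stack symbols, the view a top-k policy would see."""
        return tuple(self.stack[-k:]) if k > 0 else ()


def run(events) -> float:
    """Feed a whole event trace to a fresh machine and sum the rewards."""
    machine = PushdownRewardMachine()
    return sum(machine.step(event) for event in events)
```

In this sketch a policy with full stack access would condition on `machine.stack`, while a restricted policy would condition only on `machine.top_k(k)`; whether the two can reach the same optimal values is the kind of question the abstract's checking procedure addresses.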
Low-cost Multi-agent Fleet for Acoustic Cooperative Localization Research
Durrant, Nelson, Meyers, Braden, McMurray, Matthew, Smith, Clayton, Anderson, Brighton, Hodgins, Tristan, Velasco, Kalliyan, Mangelson, Joshua G.
Real-world underwater testing for multi-agent autonomy presents substantial financial and engineering challenges. In this work, we introduce the Configurable Underwater Group of Autonomous Robots (CoUGARs) as a low-cost, configurable autonomous-underwater-vehicle (AUV) platform for multi-agent autonomy research. The base design costs less than $3,000 USD (as of May 2025) and is based on commercially-available and 3D-printed parts, enabling quick customization for various sensor payloads and configurations. Our current expanded model is equipped with a Doppler velocity log (DVL) and ultra-short-baseline (USBL) acoustic array/transducer to support research on acoustic-based cooperative localization. State estimation, navigation, and acoustic communications software has been developed and deployed using a containerized software stack and is tightly integrated with the HoloOcean simulator. The system was tested both in simulation and via in-situ field trials in Utah lakes and reservoirs. Effective state estimation for underwater robotics is a challenging problem that is actively being addressed in academic circles.
- North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
- North America > United States > Utah > Utah County > Spanish Fork (0.04)
- North America > United States > Utah > Utah County > Provo (0.04)
- (10 more...)
- Energy (0.68)
- Machinery > Industrial Machinery (0.49)
- Government > Military (0.46)
From BERT to LLMs: Comparing and Understanding Chinese Classifier Prediction in Language Models
Zhang, Ziqi, Ma, Jianfei, Chersoni, Emmanuele, You, Jieshun, Feng, Zhaoxin
Classifiers are an important and defining feature of the Chinese language, and their correct prediction is key to numerous educational applications. Yet, whether the most popular Large Language Models (LLMs) possess proper knowledge of Chinese classifiers is an issue that has largely remained unexplored in the Natural Language Processing (NLP) literature. To address this question, we employ various masking strategies to evaluate the LLMs' intrinsic ability, the contribution of different sentence elements, and the workings of the attention mechanisms during prediction. In addition, we explore fine-tuning for LLMs to enhance classifier prediction performance. Our findings reveal that LLMs perform worse than BERT, even with fine-tuning. The prediction, as expected, greatly benefits from information about the following noun, which also explains the advantage of models with a bidirectional attention mechanism such as BERT.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Austria > Vienna (0.14)
- (13 more...)
Are LLMs Empathetic to All? Investigating the Influence of Multi-Demographic Personas on a Model's Empathy
Malik, Ananya, Sabri, Nazanin, Karnaze, Melissa, Elsherief, Mai
Large Language Models' (LLMs) ability to converse naturally is empowered by their ability to empathetically understand and respond to their users. However, emotional experiences are shaped by demographic and cultural contexts. This raises an important question: Can LLMs demonstrate equitable empathy across diverse user groups? We propose a framework to investigate how LLMs' cognitive and affective empathy vary across user personas defined by intersecting demographic attributes. Our study introduces a novel intersectional analysis spanning 315 unique personas, constructed from combinations of age, culture, and gender, across four LLMs. Results show that attributes profoundly shape a model's empathetic responses. Interestingly, we see that adding multiple attributes at once can attenuate and reverse expected empathy patterns. We show that the models' responses broadly reflect real-world empathetic trends, with notable misalignments for certain groups, such as those from Confucian culture. We complement our quantitative findings with qualitative insights to uncover model behaviour patterns across different demographic groups. Our findings highlight the importance of designing empathy-aware LLMs that account for demographic diversity to promote more inclusive and equitable model behaviour.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- South America (0.05)
- North America > Central America (0.05)
- (5 more...)
From Multimodal Perception to Strategic Reasoning: A Survey on AI-Generated Game Commentary
Zheng, Qirui, Wang, Xingbo, Cheng, Keyuan, Ali, Muhammad Asif, Lu, Yunlong, Li, Wenxin
The advent of artificial intelligence has propelled AI-Generated Game Commentary (AI-GGC) into a rapidly expanding field, offering benefits such as unlimited availability and personalized narration. However, current research in this area remains fragmented, and a comprehensive survey that systematically unifies existing efforts is still missing. To bridge this gap, our survey introduces a unified framework that systematically organizes the AI-GGC landscape. We present a novel taxonomy focused on three core commentator capabilities: Live Observation, Strategic Analysis, and Historical Recall. Commentary is further categorized into three functional types: Descriptive, Analytical, and Background. Building on this structure, we provide an in-depth review of state-of-the-art methods, datasets, and evaluation metrics across various game genres. Finally, we highlight key challenges such as real-time reasoning, multimodal integration, and evaluation bottlenecks, and outline promising directions for future research and system development in AI-GGC.
- Europe > Czechia > Prague (0.04)
- Asia > Singapore (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- (17 more...)
- Research Report (1.00)
- Overview (1.00)
- Leisure & Entertainment > Sports > Soccer (0.95)
- Leisure & Entertainment > Games > Computer Games (0.68)
- Leisure & Entertainment > Sports > Basketball (0.67)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Time Is Effort: Estimating Human Post-Editing Time for Grammar Error Correction Tool Evaluation
Vadehra, Ankit, Johnson, Bill, Saunders, Gene, Poupart, Pascal
Text editing can involve several iterations of revision. Incorporating an efficient Grammar Error Correction (GEC) tool in the initial correction round can significantly impact further human editing effort and final text quality. This raises an interesting question to quantify GEC Tool usability: How much effort can the GEC Tool save users? We present the first large-scale dataset of post-editing (PE) time annotations and corrections for two English GEC test datasets (BEA19 and CoNLL14). We introduce Post-Editing Effort in Time (PEET) for GEC Tools as a human-focused evaluation scorer to rank any GEC Tool by estimating PE time-to-correct. Using our dataset, we quantify the amount of time saved by GEC Tools in text editing. Analyzing the edit type indicated that determining whether a sentence needs correction and edits like paraphrasing and punctuation changes had the greatest impact on PE time. Finally, comparison with human rankings shows that PEET correlates well with technical effort judgment, providing a new human-centric direction for evaluating GEC tool usability. We release our dataset and code at: https://github.com/ankitvad/PEET_Scorer.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Maryland > Baltimore (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- (15 more...)
- Research Report (0.82)
- Overview (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
- (3 more...)